Patient Perceptions of Chatbot Supervision in Health Care Settings

This survey study assesses whether patients communicating with a chatbot in a large health care system were able to accurately identify it as an unsupervised computer application.


Introduction
Conversational agents built on artificial intelligence (AI), known as chatbots, are being implemented for patient-facing communication in health care systems. 1 Emerging research focuses on how users perceive chatbots' anthropomorphic features 2 and what qualities promote humanlike interactions. 3 Limited evidence exists about patients' understanding of chatbot supervision (ie, whether chatbots are operated or monitored by humans in real time). To inform implementation of chatbots in health systems, we assessed patient perceptions of chatbot supervision in a health care setting.

Methods
This survey study followed the American Association for Public Opinion Research (AAPOR) reporting guideline. We surveyed chatbot users who sent at least 3 messages back and forth with a large health system chatbot between July 2022 and September 2023. Internal electronic health record data suggested users would be predominantly White (76.8%); because of the chatbot avatar's design, users may perceive the chatbot to be a White non-Hispanic female. Therefore, we designed a sampling strategy aimed at comparing perceptions of White non-Hispanic users to other races and ethnicities (not population-level estimates). Of the 65 de novo survey questions (eMethods in Supplement 1), 2 questions asked about perceived chatbot supervision. Participants were recruited by email, and they completed surveys and written consent online via REDCap. The study was approved by the Colorado Multiple Institutional Review Board.
Phase 1 (n = 142) used simple random sampling and revealed that most users and respondents were White. Therefore, to enable subgroup comparisons, in phases 2 (n = 298) and 3 (n = 177) we oversampled other races and ethnicities vs White non-Hispanic users in a 1:2 ratio.
We used multivariable logistic regression to estimate odds of correctly identifying the chatbot as an unsupervised software application, as a function of self-reported education, race and ethnicity, sex, income, and age. Two-sided P ≤ .05 was considered statistically significant. Analyses were conducted using Stata version 18.1 (StataCorp) from October to December 2023.
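As an illustration of the kind of model described above, the following minimal Python sketch fits a logistic regression with a single binary covariate by Newton-Raphson and reports the odds ratio with a Wald 95% CI. It is not the authors' Stata analysis: the function names and the 2 × 2 counts are hypothetical, and the study's model adjusted for several covariates rather than one.

```python
import math


def score_and_information(b0, b1, x, y):
    """Return the score (gradient) and information matrix entries at (b0, b1)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))  # predicted probability
        w = p * (1.0 - p)                             # Bernoulli variance weight
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return g0, g1, h00, h01, h11


def fit_logistic_binary(x, y, iterations=25):
    """Fit logit P(y = 1) = b0 + b1 * x for one 0/1 covariate (hypothetical helper)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0, g1, h00, h01, h11 = score_and_information(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + I^{-1} * score, using the 2 x 2 inverse
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error of b1 from the inverse information at the fitted values
    _, _, h00, h01, h11 = score_and_information(b0, b1, x, y)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return b0, b1, se_b1


# Hypothetical counts, NOT study data:
# group x = 1: 60 correct, 40 incorrect; group x = 0: 40 correct, 60 incorrect
x = [1] * 100 + [0] * 100
y = [1] * 60 + [0] * 40 + [1] * 40 + [0] * 60

b0, b1, se_b1 = fit_logistic_binary(x, y)
odds_ratio = math.exp(b1)
ci_low = math.exp(b1 - 1.96 * se_b1)
ci_high = math.exp(b1 + 1.96 * se_b1)
print(f"OR = {odds_ratio:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```

With a single binary covariate the fitted odds ratio should equal the cross-product ratio of the 2 × 2 table, here (60 × 60)/(40 × 40) = 2.25, which gives a quick check that the fit converged.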
Results
In the logistic regression model (Table), users with more than a 4-year degree had approximately 6 times higher odds (odds ratio [OR], 5.97 [95% CI, 3.03-11.74]; P < .001) of correctly identifying the chatbot as unsupervised compared with respondents with high school education or less. The odds of correctly identifying the chatbot as an unsupervised computer were approximately 1.6 times higher (OR, 1.60 [95% CI, 1.08-2.36]; P = .02) for White non-Hispanic users compared with other races and ethnicities (Figure).
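The reported odds ratios, confidence intervals, and P values can be checked for internal consistency. The sketch below assumes the intervals are Wald intervals that are symmetric on the log-odds scale (an assumption, as the letter does not state the interval method) and back-calculates the standard error, z statistic, and two-sided P value from each reported estimate. The function name `wald_from_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def wald_from_ci(or_est, ci_low, ci_high, level=0.95):
    """Recover log-odds, SE, z, and two-sided P from an OR and its CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    beta = math.log(or_est)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return beta, se, z, p


# Estimates as reported in the text
reported = {
    "More than 4-year degree vs high school or less": (5.97, 3.03, 11.74),
    "White non-Hispanic vs other races and ethnicities": (1.60, 1.08, 2.36),
}

results = {}
for label, (or_est, lo, hi) in reported.items():
    results[label] = wald_from_ci(or_est, lo, hi)
    beta, se, z, p = results[label]
    print(f"{label}: log-odds = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, P = {p:.3g}")
```

Under this assumption, the back-calculated P values fall below .001 for the education comparison and round to .02 for the race and ethnicity comparison, in line with the values reported above.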

c Other sex category was removed from regression analysis due to small sample size.

d Other race encompasses individuals who identified as the following race categories: American Indian or Alaska Native, Black or African American, Asian Indian, Chamorro, Chinese, Filipino, Japanese, Korean, Native Hawaiian, Other Asian, Other Pacific Islander, Samoan, Vietnamese.
e Hispanic encompasses individuals who identified as the following ethnicity categories: Cuban; Mexican, Mexican American, Chicano; Other Hispanic, Latino, or Spanish origin; Puerto Rican.

JAMA Network Open | Ethics

Table. Multivariable Logistic Regression Results of Correctly Identifying Chatbot Supervision (N = 581) a
a Thirty-six participants removed due to missingness.
b Totals vary based on covariate missingness; 1 participant deleted due to missing outcome response.

Figure. Forest Plot of Adjusted Odds Ratios of Correctly Identifying Chatbot Supervision